333 research outputs found

Autofix for backward-fit sidechains: using MolProbity and real-space refinement to put misfits in their place

Author: BW Arendall III
D Tu
Daniel A. Keedy
David C. Richardson
G Vriend
H Bedem van den
HM Berman
IW Davis
Jane S. Richardson
Jeffrey J. Headd
JM Word
K Joosten
P Emsley
Paul Emsley
PD Adams
R Gross-Kunstleve
RA Laskowski
RJ Morris
Robert M. Immormino
RP Bahadur
SC Lovell
TA Jones
TC Terwilliger
Publication venue: Springer Netherlands
Publication date: 01/01/2008

Misfit sidechains in protein crystal structures are a stumbling block in using those structures to direct further scientific inference. Problems due to surface disorder and poor electron density are very difficult to address, but a large class of systematic errors are quite common even in well-ordered regions, resulting in sidechains fit backwards into local density in predictable ways. The MolProbity web site is effective at diagnosing such errors, and can perform reliable automated correction of a few special cases such as 180° flips of Asn or Gln sidechain amides, using all-atom contacts and H-bond networks. However, most at-risk residues involve tetrahedral geometry, and their valid correction requires rigorous evaluation of sidechain movement and sometimes backbone shift. The current work extends the benefits of robust automated correction to more sidechain types. The Autofix method identifies candidate systematic, flipped-over errors in Leu, Thr, Val, and Arg using MolProbity quality statistics, proposes a corrected position using real-space refinement with rotamer selection in Coot, and accepts or rejects the correction based on improvement in MolProbity criteria and on χ angle change. Criteria are chosen conservatively, after examining many individual results, to ensure valid correction. To test this method, Autofix was run and analyzed for 945 representative PDB files and on the 50S ribosomal subunit of file 1YHQ. Over 40% of Leu, Val, and Thr outliers and 15% of Arg outliers were successfully corrected, resulting in a total of 3,679 corrected sidechains, or 4 per structure on average. Summary Sentences: A common class of misfit sidechains in protein crystal structures is due to systematic errors that place the sidechain backwards into the local electron density. 
A fully automated method called “Autofix” identifies such errors for Leu, Val, Thr, and Arg and corrects over one third of them, using MolProbity validation criteria and Coot real-space refinement of rotamers

Crossref

Springer - Publisher Connector

PubMed Central

Oxford University Research Archive

PON-SC – program for identifying steric clashes caused by amino acid substitutions

Author: A Niroula
A Trovato
C O'Fagain
CS Poultney
D Frishman
DE Goldgar
E Eyal
EC Chao
F Pedregosa
FA Kondrashov
FO Desmet
G Jeffrey
GG Krivov
H Ali
HM Berman
I Lappalainen
J Cheng
J Pottel
J Thusberg
J Väliaho
JD Wright
Jelena Čalyševa
JM Schwarz
JM Word
JS Richardson
K Laurila
K Nagata
KP Tan
M Kircher
M Mort
M Vihinen
Mauno Vihinen
NM Lindor
O Conchillo-Sole
PJ Cock
PS Nair
R Rouet
RD Socha
RW Hooft
S Yin
SC Lovell
Y Yang
Z Shi
Publication venue: Springer Science and Business Media LLC

Crossref

Protein Design Using Continuous Rotamers

Author: AE Eriksson
AR Leach
B Kuhlman
BI Dahiyat
BR Donald
Bruce R. Donald
C Chen
C Wang
DA Pearlman
DB Gordon
DJ Huggins
G Wang
I Georgiev
J Desmet
J Word
JM Word
JT Kellis Jr
K Raha
KE Roberts
KM Frey
KW Kaufmann
Kyle E. Roberts
L Jiang
MJ Gorczynski
NA Pierce
Pablo Gainza
R Abagyan
R Goldstein
R Lilien
RH Lilien
S Henikoff
S Hubbard
Sarah A. Teichmann
SC Lovell
SM Lippow
T Harder
T Kortemme
T Lazaridis
VB Chen
W Sheffler
X Hu
Y Dehouck
Publication venue: Public Library of Science
Publication date: 01/01/2012

Optimizing amino acid conformation and identity is a central problem in computational protein design. Protein design algorithms must allow realistic protein flexibility to occur during this optimization, or they may fail to find the best sequence with the lowest energy. Most design algorithms implement side-chain flexibility by allowing the side chains to move between a small set of discrete, low-energy states, which we call rigid rotamers. In this work we show that allowing continuous side-chain flexibility (which we call continuous rotamers) greatly improves protein flexibility modeling. We present a large-scale study that compares the sequences and best energy conformations in 69 protein-core redesigns using a rigid-rotamer model versus a continuous-rotamer model. We show that in nearly all of our redesigns the sequence found by the continuous-rotamer model is different and has a lower energy than the one found by the rigid-rotamer model. Moreover, the sequences found by the continuous-rotamer model are more similar to the native sequences. We then show that the seemingly easy solution of sampling more rigid rotamers within the continuous region is not a practical alternative to a continuous-rotamer model: at computationally feasible resolutions, using more rigid rotamers was never better than a continuous-rotamer model and almost always resulted in higher energies. Finally, we present a new protein design algorithm based on the dead-end elimination (DEE) algorithm, which we call iMinDEE, that makes the use of continuous rotamers feasible in larger systems. iMinDEE guarantees finding the optimal answer while pruning the search space with close to the same efficiency of DEE. Availability: Software is available under the Lesser GNU Public License v3. Contact the authors for source code
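The abstract above builds on dead-end elimination (DEE). As a rough illustration only, the sketch below implements the classic rigid-rotamer DEE pruning criterion, not iMinDEE itself: rotamer r at a position is discarded when its best-case energy is still worse than the worst-case energy of some competing rotamer t at the same position. The function names, the `E1`/`E2` energy tables, and the toy numbers are all assumptions made for this example; none come from the paper or its software.

```python
def pair(E2, i, r, j, s):
    """Pairwise energy between rotamer r at position i and rotamer s at position j.

    E2 is a hypothetical table keyed by (i, j) with i < j, holding a 2D list [r][s].
    """
    return E2[(i, j)][r][s] if i < j else E2[(j, i)][s][r]


def dee_prune(E1, E2):
    """Return, per position, the set of rotamer indices that survive simple DEE.

    E1[i][r] is the self energy of rotamer r at position i (hypothetical layout).
    """
    n = len(E1)
    alive = [set(range(len(E1[i]))) for i in range(n)]
    changed = True
    while changed:  # repeat until a full pass prunes nothing
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]

            def bound(rot, pick):
                # pick=min gives the best case for rot, pick=max the worst case
                return E1[i][rot] + sum(
                    pick(pair(E2, i, rot, j, s) for s in alive[j]) for j in others
                )

            for r in list(alive[i]):
                # r is a dead end if even its best case loses to some t's worst case
                if any(bound(r, min) > bound(t, max) for t in alive[i] if t != r):
                    alive[i].discard(r)
                    changed = True
    return alive


# Toy system: two positions, two rotamers each (made-up energies).
E1 = [[0.0, 5.0], [0.0, 0.0]]
E2 = {(0, 1): [[0.0, 1.0], [0.0, 1.0]]}
survivors = dee_prune(E1, E2)
```

In this toy case only rotamer 0 survives at each position. With continuous rotamers, the fixed pairwise energies would have to be replaced by bounds over each rotamer's neighbourhood, which is the problem the abstract says iMinDEE addresses.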

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

California Men's Health Study (CMHS): a multiethnic cohort in a managed care setting

Author: AC Society
AE Hak
AR Folsom
AR Kristal
Barbara Sternfeld
BC Choi
Bette J Caan
BH Hage
CA Derby
Charles P Quesenberry
CL Meinert
DL Word
Donna M Schaffer
DRJ Jacobs
E Giovannucci
E White
GA Colditz
HO Adami
JM Samet
JM Yuan
JR Hunt
KH Schmitz
KJ Rothman
Laurel A Habel
Marianne C Sadler
MJ Barry
NA Ponce
NE Breslow
P Verhoef
PH Gann
PL Horn-Ross
RE Patterson
Ronald K Loo
S Greenland
S Sidney
Sarah Rowell
Shelley M Enger
Stephen K Van Den Eeden
W Zheng
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We established a male, multiethnic cohort primarily to study prostate cancer etiology and secondarily to study the etiologies of other cancer and non-cancer conditions. METHODS/DESIGN: Eligible participants were 45-to-69 year old males who were members of a large, prepaid health plan in California. Participants completed two surveys on-line or on paper in 2002 – 2003. Survey content included demographics; family, medical, and cancer screening history; sexuality and sexual development; lifestyle (diet, physical activity, and smoking); prescription and non-prescription drugs; and herbal supplements. We linked study data with clinical data, including laboratory, hospitalization, and cancer data, from electronic health plan files. We recruited 84,170 participants, approximately 40% from minority populations and over 5,000 who identified themselves as other than heterosexual. We observed a wide range of education (53% completed less than college) and income. PSA testing rates (75% overall) were highest among black participants. Body mass index (BMI) (median 27.2) was highest for blacks and Latinos and lowest for Asians, and showed 80.6% agreement with BMI from clinical data sources. The sensitivity and specificity can be assessed by comparing self-reported data, such as PSA testing, diabetes, and history of cancer, to health plan data. We anticipate that nearly 1,500 prostate cancer diagnoses will occur within five years of cohort inception. DISCUSSION: A wide variety of epidemiologic, health services, and outcomes research utilizing a rich array of electronic, biological, and clinical resources is possible within this multiethnic cohort. The California Men's Health Study and other cohorts nested within comprehensive health delivery systems can make important contributions in the area of men's health

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Rational Design of Temperature-Sensitive Alleles Using Computational Structure Prediction

Author: B Cunningham
B Lee
C Cortes
Ca Rohl
Christopher S. Poultney
CJ Burges
David Gresham
Dennis E. Shasha
EH Kellogg
G Chakshusmathi
Glenn L. Butterfoss
HM Muller
JM Word
JR Quinlan
K Bajaj
K Drew
KD Pruitt
Kevin Drew
Kristin C. Gunsalus
M Hall
Michelle R. Gutwein
N Eswar
N Siew
R Varadarajan
Richard Bonneau
RJ Dohmen
S Tweedie
SF Altschul
TW Harris
Vladimir N. Uversky
WS Noble
WS Sandberg
Publication venue: Public Library of Science
Publication date: 02/09/2011

Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate “top 5” list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions

Public Library of Science (PLOS)

Crossref

PubMed Central

Computational Design of a PDZ Domain Peptide Inhibitor that Rescues CFTR Activity

Author: A Leaver-Fay
A Piserchio
A Taddei
AJW te Velthuis
AR Leach
B Brannetti
B Kuhlman
BD Allen
BI Dahiyat
BR Brooks
BR Donald
Bruce R. Donald
C Chen
C Lee
C Yanover
CA Smith
CL Kingsford
D Saro
DA Case
DB Gordon
Dean R. Madden
DM Cholon
DN Sheppard
DT Jones
E Althaus
E Bruscia
E Hong
E Kim
FV Goor
Giorgio Colombo
GK Hom
H Kamisetty
HM Sampson
I Georgiev
IN Berezovsky
J Cheng
J Desmet
J Janin
J Reina
J Thomas
J Zhang
JM Word
JR Desjarlais
JW Ponder
KA Reynolds
KM Frey
Kyle E. Roberts
L Vouilleme
LA Joachimiak
M Dayhoff
M Fromer
M Gilson
M Wolde
MD Altman
MJ Gorczynski
N Pedemonte
P Gainza
P Humbert
P Koehl
Patrick R. Cushing
PR Cushing
Prisca Boisguerin
R Goldstein
RL Dunbrack
SC Lovell
SJ Weiner
SM Lippow
SM Rowe
T Lazaridis
T Ma
U Wiedemann
WB Guggino
X Jiang
Y Li
Publication venue: Public Library of Science
Publication date: 01/01/2012

The cystic fibrosis transmembrane conductance regulator (CFTR) is an epithelial chloride channel mutated in patients with cystic fibrosis (CF). The most prevalent CFTR mutation, ΔF508, blocks folding in the endoplasmic reticulum. Recent work has shown that some ΔF508-CFTR channel activity can be recovered by pharmaceutical modulators (“potentiators” and “correctors”), but ΔF508-CFTR can still be rapidly degraded via a lysosomal pathway involving the CFTR-associated ligand (CAL), which binds CFTR via a PDZ interaction domain. We present a study that goes from theory, to new structure-based computational design algorithms, to computational predictions, to biochemical testing and ultimately to epithelial-cell validation of novel, effective CAL PDZ inhibitors (called “stabilizers”) that rescue ΔF508-CFTR activity. To design the “stabilizers”, we extended our structural ensemble-based computational protein redesign algorithm to encompass protein-protein and protein-peptide interactions. The computational predictions achieved high accuracy: all of the top-predicted peptide inhibitors bound well to CAL. Furthermore, when compared to state-of-the-art CAL inhibitors, our design methodology achieved higher affinity and increased binding efficiency. The designed inhibitor with the highest affinity for CAL (kCAL01) binds six-fold more tightly than the previous best hexamer (iCAL35), and 170-fold more tightly than the CFTR C-terminus. We show that kCAL01 has physiological activity and can rescue chloride efflux in CF patient-derived airway epithelial cells. Since stabilizers address a different cellular CF defect from potentiators and correctors, our inhibitors provide an additional therapeutic pathway that can be used in conjunction with current methods

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

HAAD: A Quick Algorithm for Accurate Prediction of Hydrogen Atoms in Protein Structures

Author: A Sali
A Verma
AA Canutescu
AA Kossiakoff
AC Anderson
AD MacKerell
Ambrish Roy
Andreas Hofmann
AT Brunger
BR Brooks
D Seeliger
E Lindahl
EL Ulrich
F Baud
FM Bickelhaupt
G Klebe
G Vriend
GD Rose
IK McDonald
J Chen
JM Word
KA Dill
L Pauling
LR Forrest
M Cohen
M Gochin
M Nilges
N Engler
RJ Read
RS Rowland
S Jones
Shakhnovich SaEI
SR Kimura
T Herrmann
V Pophristic
W Kabsch
W Wang
Y Li
Y Zhang
Yang Zhang
YQ Li
Yunqi Li
Publication venue: Public Library of Science
Publication date: 01/01/2009

Hydrogen constitutes nearly half of all atoms in proteins and their positions are essential for analyzing hydrogen-bonding interactions and refining atomic-level structures. However, most protein structures determined by experiments or computer prediction lack hydrogen coordinates. We present a new algorithm, HAAD, to predict the positions of hydrogen atoms based on the positions of heavy atoms. The algorithm is built on the basic rules of orbital hybridization followed by the optimization of steric repulsion and electrostatic interactions. We tested the algorithm using three independent data sets: ultra-high-resolution X-ray structures, structures determined by neutron diffraction, and NOE proton-proton distances. Compared with the widely used programs CHARMM and REDUCE, HAAD has a significantly higher accuracy, with the average RMSD of the predicted hydrogen atoms to the X-ray and neutron diffraction structures decreased by 26% and 11%, respectively. Furthermore, hydrogen atoms placed by HAAD have more matches with the NOE restraints and fewer clashes with heavy atoms. The average CPU cost by HAAD is 18 and 8 times lower than that of CHARMM and REDUCE, respectively. The significant advantage of HAAD in both the accuracy and the speed of the hydrogen additions should make HAAD a useful tool for the detailed study of protein structure and function. Both an executable and the source code of HAAD are freely available at http://zhang.bioinformatics.ku.edu/HAAD

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

KU ScholarWorks

PubMed Central

Delineation of VEGF-regulated genes and functions in the cervix of pregnant rodents by DNA microarray analysis

Author: A Newell
BC Timmons
C De Vries
CG Barclay
Chishimba N Mowa
CN Mowa
DA Minjarez
DT Connolly
E Dejana
E El Maradny
ED Albrecht
EH Luque
G Celia
GC Liggins
Guichuan Hou
H Eagle
H Tokuda
Hans G Folkesson
HF Dvorak
HM Wu
J Anderson
J Folkman
J Rak
JC Schaer
JM Hastings
JM Krum
JRG Challis
KJ Straach
KR Brown
L Burger
L Dussably
L Ma
M Shibuya
MS Mahendroo
N Ferrara
N Weidner
PC Leppert
R Pfundt
RA Word
Raymond E Papka
RV Stan
S Epstein
S Esser
S Harada
S Jesmin
Sharon E Usip
Subrina Jesmin
T Murohara
TC Laurent
Tianbo Li
VN Breeveld-Dwarkasing
Y Feng
Publication venue: BioMed Central
Publication date: 01/12/2008

Background: VEGF-regulated genes in the cervices of pregnant and non-pregnant rodents (rats and mice) were delineated by DNA microarray and real-time PCR, after locally altering the levels or action of VEGF using VEGF agents, namely siRNA, VEGF receptor antagonist and mouse VEGF recombinant protein. Methods: Tissues were analyzed by genome-wide DNA microarray analysis, real-time and gel-based PCR, and SEM, to decipher VEGF function during cervical remodeling. Data were analyzed by EASE score (microarray) and ANOVA (real-time PCR) followed by Scheffe's F-test for multiple comparisons. Results: Of the 30,000 genes analyzed, about 4,200 genes were altered in expression by VEGF, i.e., expression of about 2,400 and 1,700 genes was down- and up-regulated, respectively. Based on EASE score, i.e., grouping of genes according to their biological process, cell component and molecular functions, a number of vascular- and non-vascular-related processes were found to be regulated by VEGF in the cervix, including immune response (including inflammatory), cell proliferation, protein kinase activity, and cell adhesion molecule activity. Of interest, mRNA levels of a select group of genes, known to influence cervical remodeling or with the potential to do so, were altered. For example, real-time PCR analysis showed that levels of VCAM-1, a key molecule in leukocyte recruitment, endothelial adhesion, and subsequent trans-endothelial migration, were elevated about 10-fold by VEGF. Further, VEGF agents also altered mRNA levels of decorin, which is involved in cervical collagen fibrillogenesis, and expression of eNO, PLC and PKC mRNA, critical downstream mediators of VEGF. Of note, we show that VEGF may regulate cervical epithelial proliferation, as revealed by SEM. Conclusion: These data are important in that they provide new insights into VEGF's possible roles and mechanisms in cervical events near term, including cervical remodeling

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Investigation of Atomic Level Patterns in Protein—Small Ligand Interactions

Author: A Andreeva
A Kahraman
A Ochiai
A Teplyakov
AM Davis
B Huang
B Ma
BD Zoltowski
BJ Smith
Bostjan Kobe
C Angkawidjaja
D Jozic
D Liu
D Rajamani
DA Traoré
F Glaser
F Lan
G Scapin
G Wang
H Zhu
HM Berman
I Grishkovskaya
IK McDonald
J Carlsson
J Kyte
JE Coleman
JJ Ellis
JL Gifford
JM Word
JP Declercq
Ke Chen
Lukasz Kurgan
M Brylinski
M Hendlich
MM Benning
NJ Burgoyne
NM Luscombe
Q Ma
R Sankaranarayanan
RJ Zauhar
S Ansari
S Covarrubias
S Jones
S Vajda
T Dudev
U Stelzl
W Maret
Publication venue: Public Library of Science
Publication date: 16/02/2009

BACKGROUND: Shape complementarity and non-covalent interactions are believed to drive protein-ligand interaction. To date protein-protein, protein-DNA, and protein-RNA interactions were systematically investigated, which is in contrast to interactions with small ligands. We investigate the role of covalent and non-covalent bonds in protein-small ligand interactions using a comprehensive dataset of 2,320 complexes. METHODOLOGY AND PRINCIPAL FINDINGS: We show that protein-ligand interactions are governed by different forces for different ligand types, i.e., protein-organic compound interactions are governed by hydrogen bonds, van der Waals contacts, and covalent bonds; protein-metal ion interactions are dominated by electrostatic force and coordination bonds; protein-anion interactions are established with electrostatic force, hydrogen bonds, and van der Waals contacts; and protein-inorganic cluster interactions are driven by coordination bonds. We extracted several frequently occurring atomic-level patterns concerning these interactions. For instance, 73% of investigated covalent bonds were summarized with just three patterns in which bonds are formed between thiol of Cys and carbon or sulfur atoms of ligands, and nitrogen of Lys and carbon of ligands. Similar patterns were found for the coordination bonds. Hydrogen bonds occur in 67% of protein-organic compound complexes and 66% of them are formed between NH- group of protein residues and oxygen atom of ligands. We quantify relative abundance of specific interaction types and discuss their characteristic features. The extracted protein-organic compound patterns are shown to complement and improve a geometric approach for prediction of binding sites. CONCLUSIONS AND SIGNIFICANCE: We show that for a given type (group) of ligands and type of the interaction force, majority of protein-ligand interactions are repetitive and could be summarized with several simple atomic-level patterns. 
We summarize and analyze 10 frequently occurring interaction patterns that cover 56% of all considered complexes and we show a practical application for the patterns that concerns interactions with organic compounds

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Comparison study on k-word statistical measures for protein: From sequence to 'sequence space'

Author: A Andreeva
A Bairoch
A Bateman
A Kelil
AP Bradley
B Rost
B Thiruv
BE Blaisdell
CH Wu
D Barthel
EM Taylor
F Pearl
F Ronquist
G Didier
G Fichant
G Reinert
GW Stuart
J Felsenstein
J Lowe
J Soppa
JM Word
JP Egan
JP Huelsenbeck
K Komatsu
KP Wu
LP Chew
M Hirano
M Sierk
N Cobbe
N Krasnogor
N Saitoh
P Ferragina
Qi Dai
S Hochreiter
S Kumar
S Vinga
SF Altschul
TD Pham
Tianming Wang
TJ Wu
W Li
Y Fujioka
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Many proposed statistical measures can efficiently compare protein sequences to further infer protein structure, function and evolutionary information. They share the same idea of using k-word frequencies of protein sequences. Given a protein sequence, the information on its related protein sequences has not been used for protein sequence comparison until now. This paper proposes a scheme to construct a protein 'sequence space' associated with protein sequences related to the given protein, and compares the performance of statistical measures with and without the information in the protein 'sequence space'. This paper also presents two statistical measures for proteins: gre.k (generalized relative entropy) and gsm.k (gapped similarity measure). Results: We tested statistical measures, with and without protein 'sequence space', on three data sets. This not only offers a systematic and quantitative experimental assessment of these statistical measures, but also naturally complements the available comparison of statistical measures based on protein sequence. Moreover, we compared our statistical measures with alignment-based measures and the existing statistical measures. The experiments were grouped into two sets. The first one, performed via ROC (Receiver Operating Characteristic) analysis, aims at assessing the intrinsic ability of the statistical measures to discriminate and classify protein sequences. The second set of experiments aims at assessing how well our measure does in phylogenetic analysis. Based on the experiments, several conclusions can be drawn and, from them, novel guidelines for the use of protein 'sequence space' and statistical measures were obtained. Conclusion: Alignment-based measures have a clear advantage when the data is highly redundant. The most efficient statistical measure is the novel gsm.k introduced in this article, followed by cos.k. When the data becomes less redundant, the gre.k measure proposed here achieves a better performance, but all the other measures perform poorly on classification tasks. Almost all the statistical measures improve by exploring the information on 'sequence space' as word length increases, especially for less redundant data. The reasonable results of phylogenetic analysis confirm that Gdis.k based on 'sequence space' is a reliable measure for phylogenetic analysis. In summary, our quantitative analysis verifies that exploring the information on 'sequence space' is a promising way to improve the abilities of statistical measures for protein comparison
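The measures compared in this abstract all start from k-word frequencies. A minimal sketch of that idea follows: it counts overlapping k-words in two sequences and takes the cosine of the angle between the two count vectors. Treating the abstract's cos.k as this plain cosine measure is an assumption, since the abstract does not define it. The function names and example sequences are hypothetical, and the 'sequence space' construction, gre.k and gsm.k are not reproduced here.

```python
from collections import Counter
from math import sqrt


def kword_counts(seq, k):
    """Count every overlapping k-word (substring of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b, k):
    """Cosine of the angle between the k-word count vectors of sequences a and b."""
    ca, cb = kword_counts(a, k), kword_counts(b, k)
    # Only words present in both sequences contribute to the dot product.
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has no k-words
    return dot / (norm_a * norm_b)


# Made-up peptide fragments, compared on 2-words.
score = cosine_similarity("MKVLAAGIV", "MKVLSAGIV", 2)
```

Because the measure needs no alignment, it runs in time linear in the sequence lengths. That is the usual motivation for k-word statistics.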

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central